Comparative notched box plots are developed to provide
confidence intervals and hypothesis tests for the two sample
location model. The notches are confidence intervals derived
from the sign test. Rules are given for assigning confidence
coefficients to the notches to yield a 95 percent confidence
interval and 5 percent two sided test for the difference in
locations. The test that rejects no location difference when
the "sign" notches are disjoint is shown to be Mood's median
test. Circumstances under which multiple comparisons can be
carried out are also discussed.
COMPARATIVE NOTCHED BOX PLOTS


by  Thomas  P.  Hettmansperger 
The  Pennsylvania  State  University 

1.  INTRODUCTION  AND  SUMMARY 

McGill,  Tukey  and  Larsen  (1978)  discuss  notched  box  plots 
as  one  way  of  displaying  relevant  sample  information  about  a 
population.  The  box  is  determined  by  the  sample  quartiles 
(hinges) and locates the middle half of the population distribution.
The whiskers are related to the interquartile range (hinge
spread)  and  are  useful  in  identifying  stray  observations.  The 
notch  portion  of  the  plot  is  an  approximate  confidence  interval 
for  the  population  median. 

When  several  samples  are  displayed  together,  it  is  natural  to 
compare  the  notches  and  make  rough  significance  statements  about 
the  two  population  medians  under  consideration.  A  two  sided,  two 
sample  test  consists  in  rejecting  the  null  hypothesis  of  equal 
population  medians  when  the  notches  are  disjoint.  As  McGill 
et  al.  (1978,  Section  7)  point  out,  if  95%  individual  notches  are 
selected  then  the  significance  level  for  the  comparison  is  less 
than  1%,  much  too  stringent  for  rough  significance  statements. 
Their  solution  is  to  construct  the  notches  by  taking  the  ends  to 
be: 

M ± 1.7 SE (1.1)


where  M  is  the  sample  median  and  SE  is  a  sample  estimate  of  the 
asymptotic  standard  error  of  the  sample  median  when  sampling  from 
a normal population. The factor 1.7 was "empirically chosen" to
produce, on the average, a two sided 5% test that the two population
medians are equal.

In  this  paper  we  consider  notches  based  on  pairs  of  ordered 
sample  values.  Just  as  the  median  occurs  at  the  middle  of  the 
sample,  the  ends  of  the  notch  occur  at  a  given  depth  from  each  end 
of the sample. For example if n = 17 then the fifth value in
from each end provides a 95.1% confidence interval. See Noether
(1976, Table E). Furthermore, this notch is not necessarily symmetric
about the sample median as in the case of (1.1). Asymmetry
in the notch reflects additional information in the sample.

Most texts on nonparametric statistics relate this notch
(confidence interval), the median and the sign test. The confidence
coefficient for the notch is determined by the binomial
distribution (null distribution of the sign test). (Noether 1976,
Chapter 12; Lehmann 1975, Chapter 4; Hollander and Wolfe 1972,
Chapter 3.) Thus, exact rather than approximate confidence coefficients
can be associated with these notches. The sign test, sample
median and notch can be thought of as interrelated statistical
procedures. In Section 3 we will show that comparing two "sign"
notches is equivalent to constructing Mood's two sample median
test and associated confidence interval. Thus, there is an interesting
connection between the one sample "sign" procedures and the
two sample Mood procedures.


Before turning to the proposed solutions we describe the one
sample problem in the notation that will be used for the remainder
of the paper.

Suppose $X_{(1)} < \cdots < X_{(n)}$ are the ordered values of a random
sample from a continuous distribution with cdf $F(x - \theta_x)$. We will
further suppose $\theta_x$ is the unique median. A $\gamma = 1 - \alpha$ confidence
interval for $\theta_x$ is given by:

$[L_x, U_x] = [X_{(d_x)}, X_{(n - d_x + 1)}]$ (1.2)

with $P(S < d_x) = \alpha/2$ where the distribution of S is binomial with
parameters n and .5, i.e., b(n, .5). We will refer to $d_x$ as the
notch depth. Noether (1976, Table E) provides the $d_x$ values;
otherwise, they are easily found in a binomial table. The central
limit theorem, with a continuity correction, yields

$d_x \approx \frac{n}{2} + .5 - Z_{\alpha/2} \frac{\sqrt{n}}{2}$ (1.3)

where $Z_{\alpha/2}$ is the upper $\alpha/2$ percentage point of the standard normal
distribution. Since we are dealing with a symmetric binomial
distribution the approximation is adequate for sample sizes of at
least 5. We generally take $d_x$ to be the greatest integer in the
right side of (1.3). This means the true confidence coefficient
is bounded below by $\gamma$.
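The one sample "sign" notch can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and it assumes a sample with no ties. It computes the exact binomial confidence coefficient for a given depth, the approximate depth from (1.3), and the notch ends from the ordered sample.

```python
from math import comb, floor, sqrt
from statistics import NormalDist


def sign_notch_confidence(n: int, d: int) -> float:
    """Exact confidence coefficient 1 - 2 P(S < d), with S ~ b(n, .5)."""
    tail = sum(comb(n, k) for k in range(d)) / 2 ** n
    return 1.0 - 2.0 * tail


def approx_depth(n: int, gamma: float) -> int:
    """Greatest integer in the right side of (1.3) for confidence gamma."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - gamma) / 2.0)
    return floor(n / 2 + 0.5 - z * sqrt(n) / 2)


def sign_notch(sample, d: int):
    """Notch ends: the d-th value in from each end of the ordered sample."""
    xs = sorted(sample)
    return xs[d - 1], xs[len(xs) - d]


if __name__ == "__main__":
    # n = 17, depth 5: the case quoted in the Introduction.
    print(round(sign_notch_confidence(17, 5), 4))
    # Depth suggested by (1.3) for a nominal 95% notch.
    print(approx_depth(17, 0.95))
```

For n = 17 the approximation gives depth 4, one less than the tabled depth 5, so the resulting notch is wider and its exact confidence coefficient is above the nominal value, in line with the remark that rounding down is conservative.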


Given two ordered samples $X_{(1)} < \cdots < X_{(n_1)}$ and $Y_{(1)} < \cdots < Y_{(n_2)}$
from $F(x - \theta_x)$ and $F(y - \theta_y)$, respectively, we wish to pick
the two notches $[L_x, U_x]$ and $[L_y, U_y]$ such that:

1. When the notches are disjoint we reject $H_0$: $\Delta = \theta_y - \theta_x = 0$
with significance level $\alpha_c = .05$ where $\alpha_c$ is the specified
comparison error rate and

2. the differences in the notches $[L_y - U_x, U_y - L_x]$
provide a $\gamma_c = 1 - \alpha_c = .95$ confidence interval for
$\Delta = \theta_y - \theta_x$.

The solution, which is developed in detail in Sections 2
and 3, is quite simple provided the sample size ratio is not more
than 2 to 1. The confidence coefficients $\gamma_x$ and $\gamma_y$ should be chosen as
close as possible to .84. Hence the two sample test and confidence
interval for $\Delta$ are based on a pair of .84 "sign" notches. If a
table is unavailable then from (1.3), with $Z_{\alpha/2} = 1.41$ corresponding
to $\gamma_x = \gamma_y = .84$, take the notch depths to be the greatest integers in

$\frac{n_i}{2} + .5 - 1.41 \frac{\sqrt{n_i}}{2}, \quad i = 1, 2.$ (1.4)


When a significance level $\alpha_c$ other than .05 for the comparison
is desired, the notch depths are taken to be the greatest integers in

$\frac{n_i}{2} + .5 - Z_{\alpha_c/2} \frac{\sqrt{n_i}}{2\sqrt{2}}, \quad i = 1, 2.$ (1.5)

The corresponding confidence coefficient is $\gamma_x = \gamma_y = 1 - 2\Phi(-Z_{\alpha_c/2}/\sqrt{2})$.
If the ratio of sample sizes exceeds 2 to 1, adjustments must be
made in $\gamma_x$, $\gamma_y$, and the notch depth. The solution is given in Section
2, formulas (2.5) and (2.6).
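The depth rule (1.5) and its confidence coefficient are easy to evaluate without a table. The sketch below is illustrative only and uses hypothetical function names; it assumes the sample size ratio is at most 2 to 1, so that the factor $1/\sqrt{2}$ applies. Passing `z_c=2.0` reproduces the rounded rule (1.4).

```python
from math import floor, sqrt
from statistics import NormalDist


def comparison_depth(n: int, alpha_c: float = 0.05, z_c=None) -> int:
    """Greatest integer in (1.5): n/2 + .5 - Z * sqrt(n) / (2 sqrt(2))."""
    if z_c is None:
        # Upper alpha_c / 2 point of the standard normal distribution.
        z_c = NormalDist().inv_cdf(1.0 - alpha_c / 2.0)
    return floor(n / 2 + 0.5 - z_c * sqrt(n) / (2 * sqrt(2)))


def notch_confidence(z_c: float) -> float:
    """Confidence coefficient 1 - 2 Phi(-Z / sqrt(2)) of each notch."""
    return 1.0 - 2.0 * NormalDist().cdf(-z_c / sqrt(2))


if __name__ == "__main__":
    print(comparison_depth(17, z_c=2.0))
    print(round(notch_confidence(2.0), 3))
```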


Before developing the details of the solutions, we illustrate
the approach on a data set. The example shows how in many practical
situations the comparative notches can be used in a multiple comparison
of several treatments.

Example 

We illustrate the comparative notched box plots on Tippett's
(1950) warp break data. Our Figure can be compared to Figure F
of McGill et al. (1978). Tippett's data consists of 9 observations
each on 6 different types of warp. An observation consists
in the number of breaks in a fixed amount of weaving.

A notch depth of 3 determines an exact 82% notch for the
population median. From the hypergeometric distribution the two
sample comparisons have an exact significance level of 5.7% (see
Section 3). Hence from the figure we see that al is significantly
greater than bh as judged by a 5.7% Mood two sided test and no
other pair yields significant differences at that level.

A 94.3% confidence interval for the difference in population
medians (al - bh) is easily found by taking the difference in the
notch ends: we find (5, 39).

The quartiles (hinges) occur at depth 3 so the ends of the
box coincide with the ends of the notch. We have not drawn in the
boxes  for  this  example.  The  whiskers  extend  to  the  farthest 
observation  within  one  hinge  spread  of  the  end  of  the  box. 
Observations  beyond  the  whiskers  are  marked  by  o  and  should  be 
investigated  as  possibly  stray  values.  Finally  the  asymmetry  in 
the  notches  should  be  noted  since  this  indicates  stretching  or 
compression  in  the  data. 


We have not attempted to control the overall error rate for
the 15 pairwise comparisons. Using Bonferroni's inequality the
overall error rate would be bounded above by 15 × .057 = .855.
Since the sample sizes are equal we could set the comparison error
rate equal to $\alpha_o/15$ where $\alpha_o$ is the specified overall error rate.
Then the notch depths are approximated by (1.5). For example, if
$\alpha_o = .15$ so that $\alpha_c = .01$ and $Z_{\alpha_c/2} = 2.576$ we find the depth to be
2 rather than 3 which was used in the example and $\gamma_x = \gamma_y = .93$. The
al and bh notches are still disjoint so the comparative statements
remain the same.
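The Bonferroni adjustment in this example can be reproduced in a few lines. This is a sketch under the assumptions stated in the text (k equal sized samples, depth approximated by (1.5)); the function name and the returned tuple layout are hypothetical.

```python
from math import comb, floor, sqrt
from statistics import NormalDist


def bonferroni_notch(k: int, n: int, alpha_o: float):
    """Depth and notch confidence for all pairwise comparisons of k samples.

    Assumes k samples of common size n and an overall error rate alpha_o
    split evenly over the k(k - 1)/2 pairs.
    """
    pairs = comb(k, 2)
    alpha_c = alpha_o / pairs
    z_c = NormalDist().inv_cdf(1.0 - alpha_c / 2.0)
    depth = floor(n / 2 + 0.5 - z_c * sqrt(n) / (2 * sqrt(2)))  # (1.5)
    gamma = 1.0 - 2.0 * NormalDist().cdf(-z_c / sqrt(2))
    return pairs, alpha_c, z_c, depth, gamma


if __name__ == "__main__":
    # Six warp types with nine observations each, overall rate .15.
    print(bonferroni_notch(6, 9, 0.15))
```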

The  approach  to  multiple  comparisons  of  k  samples  will  work 
as long as the k(k - 1)/2 ratios of sample sizes do not exceed 2 to
1.  For  larger  ratios  the  method  will  not  work  because  more  than 
one  notch  would  be  required  for  each  sample. 

-  Figure  - 

2.  THE  APPROXIMATE  SOLUTION 

We begin with the specified comparison error rate $\alpha_c$ and
derive the notch depth formulas and the formula for determining
the confidence coefficients $\gamma_x$ and $\gamma_y$.

Suppose $\gamma_x$ and $\gamma_y$, to be determined, are the confidence
coefficients for the two notches. When using the approximating
distributions we will only consider the case

$\gamma_x = \gamma_y = \gamma.$ (2.1)

Hence $\alpha = 1 - \gamma$ and the depth $d_x$ are related by (1.3) and similarly
for $d_y$. In case $d_x$ (or $d_y$) is not an integer taking the depth to
be the greatest integer in $d_x$ (or $d_y$) will produce a slightly wider notch
and a slightly conservative confidence coefficient.

Using the same argument as Lehmann (1963, Lemma 4) it is easy
to show that the lower end of the X-notch and the upper end of
the Y-notch have normal approximating distributions given by

$L_x = X_{(d_x)} \sim n\left(\theta_x - \frac{Z_{\alpha/2}}{2\sqrt{n_1} f(0)}, \frac{1}{4 n_1 f^2(0)}\right)$

$U_y = Y_{(n_2 - d_y + 1)} \sim n\left(\theta_y + \frac{Z_{\alpha/2}}{2\sqrt{n_2} f(0)}, \frac{1}{4 n_2 f^2(0)}\right)$ (2.2)


where  f(0)  is  the  height  of  the  density  of  F  at  the  median. 

One side of the comparative test of $H_0$: $\Delta = \theta_y - \theta_x = 0$ rejects
if $L_x > U_y$. By symmetry of the normal approximating distributions
and the independence of the two samples, the two sided significance
level is approximately

$\alpha_c \approx 2\Phi\left(-Z_{\alpha/2} \frac{\sqrt{n_1} + \sqrt{n_2}}{\sqrt{n_1 + n_2}}\right)$ (2.3)

where $\Phi(\cdot)$ is the standard normal cdf.
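The step from (2.2) to (2.3) can be spelled out as follows. This is a sketch under the normal approximations in (2.2), taking $\Delta = 0$ so that $\theta_x = \theta_y$, and using the independence of the two samples to add the variances.

```latex
\begin{align*}
L_x - U_y &\sim n\!\left(-\frac{Z_{\alpha/2}}{2 f(0)}
      \left(\frac{1}{\sqrt{n_1}} + \frac{1}{\sqrt{n_2}}\right),\;
      \frac{1}{4 f^2(0)}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right), \\
P(L_x > U_y) &\approx \Phi\!\left(-Z_{\alpha/2}\,
      \frac{1/\sqrt{n_1} + 1/\sqrt{n_2}}{\sqrt{1/n_1 + 1/n_2}}\right)
   = \Phi\!\left(-Z_{\alpha/2}\,
      \frac{\sqrt{n_1} + \sqrt{n_2}}{\sqrt{n_1 + n_2}}\right).
\end{align*}
```

The density height f(0) cancels, so the approximate level does not depend on F. Doubling the one sided probability gives the two sided level.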
Hence for a specified $\alpha_c$ the value of $Z_{\alpha/2}$, needed in (1.3), is
given by

$Z_{\alpha/2} = Z_{\alpha_c/2} \frac{\sqrt{n_1 + n_2}}{\sqrt{n_1} + \sqrt{n_2}}$ (2.4)

and the notch depths are given by

$d_i = \frac{n_i}{2} + .5 - Z_{\alpha_c/2} \frac{\sqrt{n_i}}{2} \frac{\sqrt{n_1 + n_2}}{\sqrt{n_1} + \sqrt{n_2}}, \quad i = 1, 2.$ (2.5)


The corresponding value of $\gamma = \gamma_x = \gamma_y$ is then found by using the
normal approximation to $P(S < d)$ discussed under (1.2). We have

$\gamma = 1 - 2\Phi\left(-Z_{\alpha_c/2} \frac{\sqrt{n_1 + n_2}}{\sqrt{n_1} + \sqrt{n_2}}\right)$ (2.6)


Let $\lambda$ be the ratio of sample sizes and note that

$\frac{\sqrt{n_1 + n_2}}{\sqrt{n_1} + \sqrt{n_2}} = \frac{\sqrt{1 + \lambda}}{1 + \sqrt{\lambda}}$ (2.7)

The expression in (2.7) varies from .7174 at $\lambda = .5$ to $.7071 = 1/\sqrt{2}$
at $\lambda = 1$. Hence if the ratio of sample sizes is less than 2 to 1,
we will use $.71 \approx 1/\sqrt{2}$ in (2.7) to get (1.5) from (2.5).
When $\alpha_c = .05$, take $Z_{\alpha_c/2} = 2$ and (1.4) follows immediately from (1.5).
Furthermore, $\gamma = 1 - 2\Phi(-\sqrt{2}) = .84$ from (2.6).
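Formulas (2.5) through (2.7) can be evaluated directly, which is useful when the sample sizes are too unbalanced for the $1/\sqrt{2}$ shortcut. The sketch below uses hypothetical function names and is only an illustration of those three formulas.

```python
from math import floor, sqrt
from statistics import NormalDist


def size_factor(n1: int, n2: int) -> float:
    """The ratio sqrt(n1 + n2) / (sqrt(n1) + sqrt(n2)) appearing in (2.7)."""
    return sqrt(n1 + n2) / (sqrt(n1) + sqrt(n2))


def depths_25(n1: int, n2: int, z_c: float):
    """Greatest integers in the notch depths (2.5) for the two samples."""
    c = size_factor(n1, n2)
    return tuple(floor(n / 2 + 0.5 - z_c * sqrt(n) / 2 * c) for n in (n1, n2))


def gamma_26(n1: int, n2: int, z_c: float) -> float:
    """Common notch confidence coefficient from (2.6)."""
    return 1.0 - 2.0 * NormalDist().cdf(-z_c * size_factor(n1, n2))


if __name__ == "__main__":
    print(round(size_factor(10, 20), 4), round(size_factor(12, 12), 4))
    print(depths_25(9, 9, 2.0), round(gamma_26(9, 9, 2.0), 3))
```

As the sizes become more unbalanced the factor rises above $1/\sqrt{2}$, so (2.6) calls for notches with a confidence coefficient above .84.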
In summary: If we determine the notch depth from (1.5) with
$Z_{\alpha_c/2} = 2$ then we have roughly 84% confidence intervals for the
population medians. If we reject the null hypothesis of equal
population medians when the notches (confidence intervals) are
disjoint then the significance level of this test is roughly 5%.
These remarks hold for all but very unbalanced sample sizes in
which case (2.6) provides the required confidence coefficient
corresponding to depths given by (2.5).

From (2.2) it follows that

$[Y_{(d_y)} - X_{(n_1 - d_x + 1)}, \; Y_{(n_2 - d_y + 1)} - X_{(d_x)}]$ (2.8)

is a confidence interval for $\Delta = \theta_y - \theta_x$ with confidence coefficient
$\gamma_c = 1 - \alpha_c$ determined in (2.3). Using 84% notches yields an
approximate 95% confidence interval for $\Delta = \theta_y - \theta_x$. Hence we find
the confidence interval for $\Delta$ by taking differences in the ends of
the notches in the notched box plot.

The natural point estimate for $\Delta$ is simply

$\hat{\Delta} = \operatorname{med} Y_i - \operatorname{med} X_j$ (2.9)

the difference in the individual point estimates.
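The whole approximate procedure (depths from (1.4), disjoint notch test, interval (2.8), estimate (2.9)) fits in a short routine. The following is a sketch with made up data, not Tippett's; the function names are hypothetical, and the `max(1, ...)` guard for very small samples is an added assumption, not part of the paper.

```python
from math import floor, sqrt
from statistics import median


def notch(sample, d: int):
    """Ends of the notch at depth d: d-th value in from each end."""
    s = sorted(sample)
    return s[d - 1], s[len(s) - d]


def depth_14(n: int) -> int:
    """Greatest integer in (1.4); the floor at 1 is an added safeguard."""
    return max(1, floor(n / 2 + 0.5 - 1.41 * sqrt(n) / 2))


def compare(x, y):
    """Disjoint notch test, interval (2.8) and point estimate (2.9)."""
    lx, ux = notch(x, depth_14(len(x)))
    ly, uy = notch(y, depth_14(len(y)))
    disjoint = lx > uy or ly > ux      # reject equal medians if True
    interval = (ly - ux, uy - lx)      # approximate 95% interval for Delta
    estimate = median(y) - median(x)
    return disjoint, interval, estimate


if __name__ == "__main__":
    # Illustrative data only.
    x = [12, 15, 11, 18, 14, 16, 13, 17, 19]
    y = [25, 22, 28, 24, 27, 23, 26, 29, 21]
    print(compare(x, y))
```

The notches are disjoint exactly when 0 falls outside the interval (2.8), so the test and the interval always agree.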

3.  THE  EXACT  SOLUTION 

We first discuss Mood's median test for $H_0$: $\Delta = \theta_y - \theta_x = 0$
vs. $H_A$: $\Delta \neq 0$. The test is described in detail by Noether (1976,
p. 161). In order to simplify the notation in this section we
will replace $n_1$ by m, the X-sample size and $n_2$ by n, the Y-sample
size. The essential part of the median test is

$L = \#(Y_i < M_c), \quad i = 1, 2, \ldots, n$ (3.1)

where $M_c$ is the median of the combined sample. For ease of
discussion we will consider the case m+n even so that $M_c$ is the
average of the middle two observations in the combined sample. The
null hypothesis will be rejected when L is too large or too small.
Under $H_0$: $\Delta = 0$, L has a hypergeometric distribution and the tails
of this distribution determine the critical region.

Gastwirth (1968) and Pratt (1964) have pointed out that L can
be expressed in the following form:

$L = \#(Y_{(i)} - X_{(\frac{m+n}{2} - i + 1)} < 0), \quad i = 1, \ldots, n.$ (3.2)

(We will suppose without loss of generality that $m \geq n$.) From this
form (which is similar to a one sample sign test form) we
immediately have that the Hodges-Lehmann (1963) point estimate of
$\Delta$ is

$\hat{\Delta} = \operatorname{med}(Y_{(i)} - X_{(\frac{m+n}{2} - i + 1)})$ (3.3)

and the confidence interval for $\Delta$ based on L is determined by the
dth largest and smallest of the differences in (3.2). Just as in
the case of the one sample sign test, the confidence coefficient
$\gamma_c$ is related to d by

$\frac{\alpha_c}{2} = P(L < d)$ (3.4)

where $\gamma_c = 1 - \alpha_c$, and L has a hypergeometric distribution.
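The two forms of L and the exact level in (3.4) can be checked numerically. The sketch below is illustrative, with hypothetical function names; it assumes m + n is even, m is at least n, and there are no ties. The hypergeometric probabilities are written out with binomial coefficients: L counts the Y's among the (m+n)/2 observations below the combined median.

```python
from math import comb
from statistics import median


def count_31(x, y) -> int:
    """L from (3.1): number of Y's below the combined sample median."""
    mc = median(list(x) + list(y))
    return sum(1 for v in y if v < mc)


def count_32(x, y) -> int:
    """L from (3.2): number of negative differences Y_(i) - X_((m+n)/2-i+1)."""
    xs, ys = sorted(x), sorted(y)
    half = (len(xs) + len(ys)) // 2
    # X_((m+n)/2 - i + 1) in 1-based notation is xs[half - i] in 0-based.
    return sum(1 for i in range(1, len(ys) + 1) if ys[i - 1] - xs[half - i] < 0)


def mood_level(m: int, n: int, d: int) -> float:
    """Two sided level 2 P(L < d) from (3.4), L hypergeometric under H0."""
    half = (m + n) // 2
    tail = sum(comb(half, l) * comb(m + n - half, n - l) for l in range(d))
    return 2 * tail / comb(m + n, n)


if __name__ == "__main__":
    x = [3, 8, 1, 12, 7, 15, 10, 5]   # illustrative data, m = 8
    y = [6, 14, 2, 11]                # illustrative data, n = 4
    print(count_31(x, y), count_32(x, y))
    print(round(mood_level(9, 9, 3), 4))
```

With m = n = 9 and d = 3 the level agrees with the 5.7% quoted in the Example.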

It is easy to see that the differences in (3.2) are naturally
ordered as follows: (recall $m \geq n$)

$Y_{(1)} - X_{(\frac{m+n}{2})} < Y_{(2)} - X_{(\frac{m+n}{2} - 1)} < \cdots < Y_{(d)} - X_{(\frac{m+n}{2} - d + 1)} < \cdots < Y_{(n)} - X_{(\frac{m-n}{2} + 1)}$ (3.5)

This means that $\hat{\Delta}$ in (3.3) becomes:

$\hat{\Delta} = \operatorname{med}_i Y_i - \operatorname{med}_j X_j,$ (3.6)

which agrees with (2.9). Further when d is defined by (3.4) a
$100\gamma_c\%$ confidence interval for $\Delta$ is simply

$[Y_{(d)} - X_{(\frac{m+n}{2} - d + 1)}, \; Y_{(n-d+1)} - X_{(\frac{m-n}{2} + d)}]$ (3.7)

Finally it should be noted that under $H_0$: $\Delta = 0$

$E L = \frac{n}{2}$ and $\operatorname{Var} L = \frac{mn}{4(m+n-1)}.$ (3.8)

The normal approximation, with continuity correction, can be
applied to yield, from (3.4),

$d \approx \frac{n}{2} + .5 - Z_{\alpha_c/2} \sqrt{\frac{mn}{4(m+n-1)}}.$ (3.9)

The two sided size $\alpha_c$ Mood test is equivalent to rejecting
$H_0$: $\Delta = 0$ when 0 is not in the $\gamma_c = 1 - \alpha_c$ confidence interval
given by (3.7). We now turn to the relationship between this test
or confidence interval and the notches described in (1.2).

We will take apart (3.7) in the obvious way: let $d_y = d$, d
determined by (3.4) and let

$d_x = d_y + \frac{m-n}{2}$ (3.10)

then (3.7) yields the two separate intervals defined by depths $d_y$
and $d_x$. (Compare to (2.8).) The confidence coefficients for the
two intervals are given by the binomial distribution discussed under
(1.2).


Hence if d is determined exactly by (3.4) or approximately
by (3.9) to produce a two sided size $\alpha_c$ Mood test then this Mood
test is equivalent to rejecting $H_0$: $\Delta = 0$ when the notches are
disjoint where $d_y = d$ and $d_x = d_y + (m-n)/2$.

Using (3.9) to approximate d and taking $d_y = d$ the Y-confidence
coefficient is approximately

$\gamma_y \approx 1 - 2\Phi\left(-Z_{\alpha_c/2} \sqrt{\frac{m}{m+n-1}}\right)$ (3.11)

while the X-confidence coefficient is approximately

$\gamma_x \approx 1 - 2\Phi\left(-Z_{\alpha_c/2} \sqrt{\frac{n}{m+n-1}}\right).$ (3.12)


In case m = n, $d_x = d_y$ and (3.11) and (3.12) yield $\gamma_x = \gamma_y = .84$
when $Z_{\alpha_c/2}$ is taken to be 1.96 for a .05 test. This corresponds
to (2.7).

For $m \neq n$ a Mood test with level around .05 can be constructed
as follows: From a binomial table or Noether's Table E select $d_y$
to yield $\gamma_y$ at or above .84. Then $d_x$ is determined by (3.10) and
will yield $\gamma_x$ at or below .84. Reject $\Delta = 0$ if the notches
are disjoint. By choosing $\gamma_x \leq .84 \leq \gamma_y$ the level of Mood's test
is close to .05. The exact level is found from (3.4) with
$d = d_y$.
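The exact construction can be sketched as follows. The code is illustrative, with hypothetical names and made up data; it assumes m is at least n, m - n is even, and there are no ties. Given a chosen $d_y$, it sets $d_x$ by (3.10), reports the exact binomial confidence coefficient of each notch, the exact Mood level from (3.4), and the interval for $\Delta$ obtained from the notch ends.

```python
from math import comb


def binom_conf(n: int, d: int) -> float:
    """Exact notch confidence 1 - 2 P(S < d), S ~ b(n, .5)."""
    return 1 - 2 * sum(comb(n, k) for k in range(d)) / 2 ** n


def mood_level(m: int, n: int, d: int) -> float:
    """Exact two sided level 2 P(L < d), L hypergeometric, from (3.4)."""
    half = (m + n) // 2
    tail = sum(comb(half, l) * comb(m + n - half, n - l) for l in range(d))
    return 2 * tail / comb(m + n, n)


def exact_notches(x, y, d_y: int) -> dict:
    """Notches at depths d_y and d_x = d_y + (m - n)/2, as in (3.10)."""
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    if m < n or (m - n) % 2:
        raise ValueError("sketch assumes m >= n and m - n even")
    d_x = d_y + (m - n) // 2
    lx, ux = xs[d_x - 1], xs[m - d_x]
    ly, uy = ys[d_y - 1], ys[n - d_y]
    return {
        "d_x": d_x,
        "gamma_x": binom_conf(m, d_x),
        "gamma_y": binom_conf(n, d_y),
        "level": mood_level(m, n, d_y),
        "interval": (ly - ux, uy - lx),   # same ends as (3.7)
        "reject": lx > uy or ly > ux,     # notches disjoint
    }


if __name__ == "__main__":
    x = [3, 8, 1, 12, 7, 15, 10, 5]   # illustrative data, m = 8
    y = [6, 14, 2, 11]                # illustrative data, n = 4
    print(exact_notches(x, y, d_y=1))
```

In this small illustration the Y notch has confidence above .84 and the X notch below it, as the text describes, and the exact level is close to .06.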

y 